Day 16 - Regular expressions - Multiple matches

$ grep -E "[a-z]{3}" examples.txt
dog
cat
elephant
ostrich
Dug the Dog
beholder
[...]

matches exactly three adjacent lowercase letters. It is worth noting that the matches do not overlap: in elephant, for example, the regular expression matches ele and pha only, and not ele, lep, eph, and so on. Once a match has been found, the search resumes from the character right after it. This is pretty clear if you use the -o option of grep that we learned previously, which outputs only the matching part of the string:

$ grep -Eo "[a-z]{3}" examples.txt
dog
cat
ele
pha
ost
ric
the
beh
old
[...]
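
The non-overlapping behaviour can also be checked on a single word, without relying on the contents of examples.txt. The following is a minimal sketch, assuming a grep that supports the -o option (GNU and BSD grep do); the variable name matches is only illustrative.

```shell
# Feed one word to grep through a pipe instead of reading examples.txt.
# -E enables extended syntax, -o prints each match on its own line.
matches=$(printf 'elephant\n' | grep -Eo "[a-z]{3}")
echo "$matches"
# prints:
# ele
# pha
```

The trailing nt is not printed: after ele and pha only two letters are left, which is not enough for another three-letter match.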

The braces allow you to specify ranges, so {n,m} means from n to m repetitions, {n,} means n or more, and {,m} from zero to m. For example:

$ grep -E "[a-z]{8,12}" examples.txt
elephant
beholder
aardvark
direwolf
manticore
basilisk

matches all lines containing at least one word made of 8 to 12 lowercase letters. Two specific ranges are used very often, namely {1,} and {0,}. The first matches one or more repetitions of a component, and it is useful when you know that at least one occurrence will be there, but you are unsure of the upper limit. The second one, instead, matches zero or more repetitions, so it is used for a component that may or may not be present. As these two ranges are very important, there is a special syntax for them: {1,} can be written +, while {0,} can be written *. So, the code